Network Structure and Biased Variance Estimation in Respondent Driven Sampling
Abstract
This paper explores bias in the estimation of sampling variance in Respondent Driven Sampling (RDS). Prior methodological work on RDS has focused on its problematic assumptions and the biases and inefficiencies of its estimators of the population mean. Nonetheless, researchers have given only slight attention to the topic of estimating sampling variance in RDS, despite the importance of variance estimation for the construction of confidence intervals and hypothesis tests. In this paper, we show that the estimators of RDS sampling variance rely on a critical assumption that the network is First Order Markov (FOM) with respect to the dependent variable of interest. We demonstrate, through intuitive examples, mathematical generalizations, and computational experiments, that current RDS variance estimators will always underestimate the population sampling variance of RDS in empirical networks that do not conform to the FOM assumption. Analysis of 215 observed university and school networks from Facebook and Add Health indicates that the FOM assumption is violated in every empirical network we analyze, and that these violations lead to substantially biased RDS estimators of sampling variance. We propose and test two alternative variance estimators that show some promise for reducing biases, but which also illustrate the limits of estimating sampling variance with only partial information on the underlying population social network.
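To make the FOM assumption concrete, the following is a minimal sketch, not the authors' procedure. It assumes an idealized recruitment process (a simple random walk on a connected, undirected network, started from its stationary distribution) and checks one necessary consequence of FOM: if group membership along the walk were a first-order Markov chain, the two-step group transition matrix would equal the square of the one-step matrix. The function name `group_transition_gap` and both toy networks are hypothetical illustrations.

```python
def group_transition_gap(adj, group):
    """Largest absolute gap between the actual two-step group transition
    probabilities of a stationary random walk and those implied by treating
    group membership as a first-order Markov chain.

    adj   : symmetric 0/1 adjacency matrix (list of lists), no isolated nodes
    group : group label of each node (the dependent variable)
    """
    n = len(adj)
    labels = sorted(set(group))
    k = len(labels)
    idx = {g: a for a, g in enumerate(labels)}

    deg = [sum(row) for row in adj]
    total = sum(deg)
    pi = [d / total for d in deg]  # stationary distribution (degree-proportional)

    # Node-level one-step and two-step transition matrices of the walk.
    P = [[adj[i][j] / deg[i] for j in range(n)] for i in range(n)]
    P2 = [[sum(P[i][m] * P[m][j] for m in range(n)) for j in range(n)]
          for i in range(n)]

    def lump(M):
        # Aggregate node-level transitions to group level, weighting by pi.
        T = [[0.0] * k for _ in range(k)]
        mass = [0.0] * k
        for i in range(n):
            a = idx[group[i]]
            mass[a] += pi[i]
            for j in range(n):
                T[a][idx[group[j]]] += pi[i] * M[i][j]
        return [[T[a][b] / mass[a] for b in range(k)] for a in range(k)]

    T1 = lump(P)    # observed one-step group transitions
    T2 = lump(P2)   # actual two-step group transitions
    T1_sq = [[sum(T1[a][c] * T1[c][b] for c in range(k)) for b in range(k)]
             for a in range(k)]  # two-step transitions implied by FOM

    return max(abs(T2[a][b] - T1_sq[a][b]) for a in range(k) for b in range(k))


# Toy network 1: a 4-cycle with alternating groups (consistent with FOM).
cycle = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
cycle_gap = group_transition_gap(cycle, [0, 1, 0, 1])

# Toy network 2: a 4-node path with clustered groups (inconsistent with FOM).
path = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
path_gap = group_transition_gap(path, [0, 0, 1, 1])

print(cycle_gap, path_gap)
```

A gap of zero does not prove that FOM holds, since higher-order dependence could still exist, but a nonzero gap shows that the group sequence carries more memory than a first-order chain on groups can represent.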
Related articles
Correction: Network Structure and Biased Variance Estimation in Respondent Driven Sampling
This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Approximate Bayesian Computation Estimator for Respondent-Driven Sampling
Respondent-driven sampling is a network-based technique for collecting information and making estimates about the behavior and composition of social groups in hidden populations. The non-randomly selected samples prohibit the use of the sample mean as a statistically valid estimator. Researchers have proposed several asymptotically unbiased estimators, but many fail to realize that the high variance of t...
Sample Size Calculations for Population Size Estimation Studies Using Multiplier Methods With Respondent-Driven Sampling Surveys
BACKGROUND: While guidance exists for obtaining population size estimates using multiplier methods with respondent-driven sampling surveys, we lack specific guidance for making sample size decisions. OBJECTIVE: To guide the design of multiplier method population size estimation studies using respondent-driven sampling surveys to reduce the random error around the estimate obtained. METHODS: Th...
Linked Ego Networks: Improving estimate reliability and validity with respondent-driven sampling
Respondent-driven sampling (RDS) is currently widely used for the study of HIV/AIDS-related high risk populations. However, recent studies have shown that traditional RDS methods are likely to generate large variances and may be severely biased since the assumptions behind RDS are seldom fully met in real life. To improve estimation in RDS studies, we propose a new method to generate estimates ...